Prompt Learning to Mitigate Catastrophic Forgetting in Cross-lingual Transfer for Open-domain Dialogue Generation
Dialogue systems for non-English languages have long been under-explored. In
this paper, we take the first step to investigate few-shot cross-lingual
transfer learning (FS-XLT) and multitask learning (MTL) in the context of
open-domain dialogue generation for non-English languages with limited data. We
observed catastrophic forgetting in both FS-XLT and MTL for all 6 languages in
our preliminary experiments. To mitigate the issue, we propose a simple yet
effective prompt learning approach that can preserve the multilinguality of
the multilingual pre-trained language model (mPLM) in FS-XLT and MTL by bridging
the gap between pre-training and fine-tuning with Fixed-prompt LM Tuning and
our hand-crafted prompts. Experimental results on all 6 languages in terms of
both automatic and human evaluations demonstrate the effectiveness of our
approach. Our code is available at https://github.com/JeremyLeiLiu/XLinguDial.
Comment: Accepted for presentation at SIGIR 202
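The Fixed-prompt LM Tuning idea described above can be illustrated with a minimal sketch: a hand-crafted prompt template is held fixed, and the dialogue data is wrapped in it before fine-tuning so that the tuning input resembles the pre-training text. The template wording and function names below are hypothetical illustrations, not the paper's actual prompts or code.

```python
# Minimal sketch of fixed-prompt wrapping: the template is hand-crafted
# and frozen; only the mPLM's weights would be tuned on data formatted
# this way. The template text here is a hypothetical illustration.
def wrap_with_prompt(context: str, response: str, lang: str) -> str:
    """Embed a dialogue turn in a fixed, hand-crafted prompt."""
    template = "The following is a dialogue in {lang}. Context: {ctx} Response: {resp}"
    return template.format(lang=lang, ctx=context, resp=response)

sample = wrap_with_prompt("Wie geht es dir?", "Mir geht es gut.", "German")
```

Because the template never changes, the gap between the pre-training distribution and the fine-tuning inputs is narrowed without adding any trainable prompt parameters.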
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets
The development of large language models (LLMs) such as ChatGPT has attracted a
lot of attention recently. However, their evaluation on benchmark academic
datasets remains under-explored due to the difficulty of evaluating the
generative outputs produced by this model against the ground truth. In this
paper, we aim to present a thorough evaluation of ChatGPT's performance on
diverse academic datasets, covering tasks like question-answering, text
summarization, code generation, commonsense reasoning, mathematical
problem-solving, machine translation, bias detection, and ethical
considerations. Specifically, we evaluate ChatGPT across 140 tasks and analyze
255K responses it generates in these datasets. This makes our work the largest
evaluation of ChatGPT in NLP benchmarks. In short, our study aims to validate
the strengths and weaknesses of ChatGPT in various tasks and provide insights
for future research using LLMs. We also report a newly emergent ability to
follow multi-query instructions that we found mostly in ChatGPT and other
instruction-tuned models. Our extensive evaluation shows that even though
ChatGPT is capable of performing a wide variety of tasks and may obtain
impressive performance on several benchmark datasets, it is still far from
achieving the ability to reliably solve many challenging tasks. By providing a
thorough assessment of ChatGPT's performance across diverse NLP tasks, this
paper sets the stage for a targeted deployment of ChatGPT-like LLMs in
real-world applications.
Comment: Accepted by ACL 2023 Findings. The first three authors contributed equally
A Complexity-theoretic Analysis of Green Pickup-and-Delivery Problems
In a Green Pickup-and-Delivery problem (GPD), vehicles traveling in a transport network to complete pickup-and-delivery tasks are subject to two \textit{green} constraints: limited vehicle fuel capacity, and hence short traveling range, and limited availability of refueling infrastructure for the vehicles. At first glance, GPD adds only modest additional computational complexity to the classic, already NP-hard Pickup-and-Delivery Problem and Vehicle Routing Problem. Nevertheless, we demonstrate in this paper that the green components are inherently intractable in themselves. More precisely, we show that GPD problems whose constraints are reduced to essentially the green ones alone remain NP-complete in the strong sense. We also identify a specifically constrained variant of GPD that is only weakly NP-complete, and give a practical pseudo-polynomial time algorithm that solves it. The insights obtained from this complexity-theoretic analysis deepen the understanding of GPDs and can guide the development of better heuristics for solving them, with many promising real-world applications.
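The distinction between strong and weak NP-completeness drawn above hinges on pseudo-polynomial time: a running time polynomial in the numeric magnitude of an input parameter (for example, a capacity bound) rather than in the length of its encoding. As an illustration of the concept only, not the paper's actual algorithm, the classic 0/1 knapsack dynamic program is the textbook example of a pseudo-polynomial algorithm for a weakly NP-complete problem:

```python
def knapsack_max_value(weights, values, capacity):
    """0/1 knapsack by dynamic programming: O(n * capacity) time.
    Pseudo-polynomial because the running time grows with the numeric
    value of `capacity`, not with the length of its binary encoding."""
    dp = [0] * (capacity + 1)  # dp[c] = best value achievable with capacity c
    for w, v in zip(weights, values):
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]
```

A strongly NP-complete problem, by contrast, admits no such algorithm unless P = NP, which is what makes the strong-sense result for the green constraints significant.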
Image Translation by Ad CycleGAN for COVID-19 X-Ray Images: A New Approach for Controllable GAN
We propose a new generative model, the adaptive cycle-consistent generative adversarial network (Ad CycleGAN), to perform image translation between normal and COVID-19 positive chest X-ray images. An independent pre-trained criterion is added to the conventional CycleGAN architecture to exert adaptive control over image translation. The performance of Ad CycleGAN is compared with that of CycleGAN without the external criterion. The quality of the synthetic images is evaluated by quantitative metrics including Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Peak Signal-to-Noise Ratio (PSNR), Universal Image Quality Index (UIQI), Visual Information Fidelity (VIF), Frechet Inception Distance (FID), and translation accuracy. The experimental results indicate that the synthetic images generated by either CycleGAN or Ad CycleGAN have lower MSE and RMSE, and higher PSNR, UIQI, and VIF scores, in homogeneous image translation (i.e., Y → Y) than in heterogeneous image translation (i.e., X → Y). The synthetic images produced by Ad CycleGAN through heterogeneous image translation have a significantly higher FID score than those of CycleGAN (p < 0.01). Therefore, we conclude that Ad CycleGAN with the independent criterion can improve the accuracy of GAN image translation. The new architecture offers more control over image synthesis and can help address the common class imbalance issue in machine learning and artificial intelligence applications with medical images.
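Three of the quality metrics listed above (MSE, RMSE, and PSNR) have standard closed forms and can be computed directly from the image arrays. A minimal NumPy sketch follows; the function names are our own for illustration, not taken from the paper's code.

```python
import numpy as np

def mse(x: np.ndarray, y: np.ndarray) -> float:
    """Mean Squared Error between two images of the same shape."""
    diff = x.astype(np.float64) - y.astype(np.float64)
    return float(np.mean(diff ** 2))

def rmse(x: np.ndarray, y: np.ndarray) -> float:
    """Root Mean Squared Error: the square root of MSE."""
    return mse(x, y) ** 0.5

def psnr(x: np.ndarray, y: np.ndarray, max_val: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB for images on a [0, max_val] scale."""
    m = mse(x, y)
    return float("inf") if m == 0 else 10.0 * np.log10(max_val ** 2 / m)
```

Note that lower MSE/RMSE and higher PSNR indicate closer agreement between images, whereas for FID a lower score indicates synthetic images closer to the real distribution.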
Domain Adaptation with Pre-trained Transformers for Query-Focused Abstractive Text Summarization
The Query-Focused Text Summarization (QFTS) task aims at building systems that generate a summary of the given text document(s) based on a query. A key challenge in addressing this task is the lack of large labeled datasets for training the summarization model. In this article, we address this challenge by exploring a series of domain adaptation techniques. Given the recent success of pre-trained transformer models in a wide range of natural language processing tasks, we utilize such models to generate abstractive summaries for the QFTS task in both single-document and multi-document scenarios. For domain adaptation, we apply a variety of techniques to pre-trained transformer-based summarization models, including transfer learning, weakly supervised learning, and distant supervision. Extensive experiments on six datasets show that our proposed approach is very effective in generating abstractive summaries for the QFTS task, setting a new state-of-the-art result on several datasets across a set of automatic and human evaluation metrics.